HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN

Authors

Abstract

Spatiotemporal representations learned using 3D convolutional neural networks (CNNs) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNNs are notorious for being memory and compute intensive compared with simpler 2D-CNN architectures. We propose to hallucinate spatiotemporal representations from a 3D-CNN teacher with a 2D-CNN student. By requiring the 2D-CNN to predict the future and intuit upcoming activity, it is encouraged to gain a deeper understanding of actions and how they evolve. The hallucination task is treated as an auxiliary task, which can be used with any other action-related task in a multitask learning setting. Through thorough experimental evaluation, it is shown that the hallucination task indeed helps improve performance on action recognition, action quality assessment, and dynamic scene recognition tasks. From a practical standpoint, being able to hallucinate spatiotemporal representations without an actual 3D-CNN can enable deployment in resource-constrained scenarios, such as those with limited computing power and/or lower bandwidth. We also observed that our hallucination task has utility not only during the training phase, but also during the pre-training phase.
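
As a rough illustration of treating hallucination as an auxiliary task, the sketch below adds a feature-matching penalty between student and teacher features to a main-task classification loss. The abstract does not state the actual loss, so the mean-squared-error term, the function names, and the `weight` hyperparameter are all assumptions made for illustration, not the authors' formulation.

```python
import math


def mse(student_feat, teacher_feat):
    """Mean squared error between two equal-length feature vectors."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def multitask_loss(logits, label, student_feat, teacher_feat, weight=1.0):
    """Main-task loss plus a weighted hallucination term.

    Assumption: the hallucination term is an MSE between the 2D-CNN
    student's features and the 3D-CNN teacher's features, and `weight`
    is a hypothetical balancing hyperparameter.
    """
    main = cross_entropy(logits, label)
    hallucination = mse(student_feat, teacher_feat)
    return main + weight * hallucination
```

In this sketch the teacher features would only be needed while training; at inference time the student runs on its own, which is the deployment benefit the abstract describes.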

Similar resources

SURF-ing a Model of Spatiotemporal Saliency

Zhai and Shah (2006) proposed a model of spatiotemporal saliency using a combination of temporal and spatial attention models. The temporal model utilized Lowe’s SIFT (2004) to compute feature points and the correspondences between them in successive frames. Bay, Tuytelaars, & Van Gool introduced SURF (2006) as an alternative feature detector and descriptor. The authors of SURF show that it is ...

CNN Technology for Spatiotemporal Signal Processing

Copyright © 2009 David López Vilariño et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Cellular Neural Networks (CNNs) are a paradigm for non-linear spatial-temporal dynamics and the core of the Cellular Wave Computing (a...

Matching 3D Shapes Using 2D Conformal Representations

Matching 3D shapes is a fundamental problem in Medical Imaging with many applications including, but not limited to, shape deformation analysis, tracking etc. Matching 3D shapes poses a computationally challenging task. The problem is especially hard when the transformation sought is diffeomorphic and non-rigid between the shapes being matched. In this paper, we propose a novel and computationa...

Visual Language Modeling on CNN Image Representations

Measuring the naturalness of images is important to generate realistic images or to detect unnatural regions in images. Additionally, a method to measure naturalness can be complementary to Convolutional Neural Network (CNN) based features, which are known to be insensitive to the naturalness of images. However, most probabilistic image models have insufficient capability of modeling the comple...

Marginalized CNN: Learning Deep Invariant Representations

Training a deep neural network usually requires sufficient annotated samples. The scarcity of supervision samples in practice thus becomes the major bottleneck on performance of the network. In this work, we propose a principled method to circumvent this difficulty through marginalizing all the possible transformations over samples, termed as marginalized Convolutional Neural Network (mCNN). mC...

Journal

Journal title: Signals

Year: 2021

ISSN: 2624-6120

DOI: https://doi.org/10.3390/signals2030037